Abstract. This article extends [Tabirca, 2000a]. A new equation for the upper
bounds is obtained based on the Smarandache f-inferior part function. An example
involving upper diagonal matrices is given in order to illustrate that the new
equation provides a better computation.


1. INTRODUCTION


Loop imbalance is the most important overhead in many parallel applications. Because loop
structures represent the main source of parallelism, the scheduling of parallel loop iterations
to processors can reduce this overhead. Among the many methods for loop scheduling, load
balance scheduling is a recent one; it was proposed by Bull [1998] and developed by
Freeman et al. [1999, 2000]. Tabirca [2000] studied this method and proposed an equation
for the case when the work is distributed to all the processors.


Consider that there are p processors, denoted in the following by $P_1, P_2, \ldots, P_p$, and a single
parallel loop (see Figure 1).

    do parallel i=1,n
        call loop_body(i);
    end do


Figure 1. Single Parallel Loop 
We also assume that the work of the routine loop_body(i) can be evaluated and is given by
the function $w: N \to R$, where $w(i) = w_i$ represents the number of the routine's operations or
its running time (assume that $w(0) = 0$). The total amount of work for the parallel loop is
$\sum_{i=1}^{n} w(i)$. The efficient loop-scheduling algorithm distributes this total amount of
work equally on the processors, such that each processor receives a quantity of work equal to
$\frac{1}{p} \sum_{i=1}^{n} w(i)$.


Let $l_j$ and $h_j$ be the lower and upper bounds, $j = 1, 2, \ldots, p$, such that processor $j$ executes all
the iterations between $l_j$ and $h_j$. These bounds are found by distributing the work equally on the
processors using

$\sum_{i=l_j}^{h_j} w(i) = \frac{1}{p} \sum_{i=1}^{n} w(i) \quad (\forall j = 1, 2, \ldots, p).$ (1)

Moreover, they satisfy the following equations:

$l_1 = 1.$ (2.a)
if we know $l_j$, then $h_j$ is given by $\sum_{i=l_j}^{h_j} w(i) = \frac{1}{p} \sum_{i=1}^{n} w(i) = W.$ (2.b)
$l_{j+1} = h_j + 1.$ (2.c)


Suppose that Equation (2.b) is solved by an approximation from below. This means that if we have
the value $l_j$, then we find $h_j$ as follows:

$h_j = h \Leftrightarrow \sum_{i=l_j}^{h} w(i) \le W < \sum_{i=l_j}^{h+1} w(i).$ (3)
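The rule in Equation (3) can be sketched in Python as a direct scan over the work values. This is only an illustrative sketch: the function name `greedy_bounds` and the list representation of $w$ are assumptions, not part of the original method description. The comparison with $W$ is multiplied through by $p$ so that integer work values are compared exactly.

```python
def greedy_bounds(w, p):
    """Return 1-based (l_j, h_j) pairs following the rule of Equation (3).

    w[i-1] holds w(i). Each processor takes the longest run of consecutive
    iterations whose accumulated work does not exceed W = sum(w) / p.
    """
    n = len(w)
    total = sum(w)
    bounds = []
    l = 1
    for _ in range(p):
        h = l - 1                      # the block is empty so far
        acc = 0
        # Test "acc + w(h+1) <= W" as "p * (acc + w(h+1)) <= total".
        while h < n and p * (acc + w[h]) <= total:
            acc += w[h]
            h += 1
        bounds.append((l, h))
        l = h + 1                      # Equation (2.c)
    return bounds


print(greedy_bounds(list(range(1, 11)), 2))
```

For w(i) = i with n = 10 and p = 2 the sketch prints [(1, 6), (7, 9)]. Iteration 10 is not assigned by this rule; how such leftover iterations are handled is not stated in the text, so the sketch leaves that open.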


The Smarandache f-inferior part function represents a generalisation of the inferior part
function $[\cdot]: R \to Z$, $[x] = k \Leftrightarrow k \le x < k+1$. If $f: Z \to R$ is a strictly increasing function
that satisfies $\lim_{n \to -\infty} f(n) = -\infty$ and $\lim_{n \to \infty} f(n) = \infty$, then the Smarandache f-inferior part
function, denoted by $f_{[]}: R \to Z$, is defined by [see www.gallup.unm.edu/~smarandache]

$f_{[]}(x) = k \Leftrightarrow f(k) \le x < f(k+1).$ (4)
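When no closed form is known, the definition in Equation (4) can be evaluated by bracketing and bisection. The following Python sketch is an illustration under the stated assumptions on $f$ (strictly increasing on the integers and unbounded in both directions); the name `f_inferior` is hypothetical.

```python
def f_inferior(f, x):
    """Return the integer k with f(k) <= x < f(k + 1).

    Assumes f is strictly increasing on the integers and unbounded in both
    directions, as required by the definition.
    """
    lo, hi, step = 0, 1, 1
    while f(lo) > x:            # move lo left until f(lo) <= x
        lo -= step
        step *= 2
    step = 1
    while f(hi) <= x:           # move hi right until x < f(hi)
        hi += step
        step *= 2
    # Invariant: f(lo) <= x < f(hi). Shrink the bracket by bisection.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(mid) <= x:
            lo = mid
        else:
            hi = mid
    return lo


print(f_inferior(lambda k: k, 3.7))        # the ordinary inferior part
print(f_inferior(lambda k: k ** 3, 30))
```

With f(k) = k the sketch reduces to the ordinary inferior part function, which is the special case mentioned above.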
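Tabirca [2000a] presented some Smarandache f-inferior part functions for which
$f(k) = \sum_{i=1}^{k} i^a$. They are presented in the following:

$f(k) = \sum_{i=1}^{k} i \Rightarrow f_{[]}(x) = \left[ \frac{-1 + \sqrt{1 + 8x}}{2} \right] \quad \forall x \ge 0,$ (5)
$f(k) = \sum_{i=1}^{k} i^2 \Rightarrow f_{[]}(x) = [b(x)] \quad \forall x \ge 0,$ (6)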


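where
$b(x) = -\frac{1}{2} + \sqrt[3]{\frac{3x}{2} + \sqrt{\left(\frac{3x}{2}\right)^2 - \frac{1}{1728}}} + \sqrt[3]{\frac{3x}{2} - \sqrt{\left(\frac{3x}{2}\right)^2 - \frac{1}{1728}}}.$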
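The closed form in Equation (5) can be checked against the definition in Equation (4). The Python sketch below is restricted to integer arguments so that `math.isqrt` keeps the computation exact; the name `tri_inferior` is hypothetical.

```python
import math


def tri_inferior(x):
    """Equation (5) for an integer x >= 0: [(-1 + sqrt(1 + 8x)) / 2]."""
    return (math.isqrt(1 + 8 * x) - 1) // 2


def f(k):
    """f(k) = 1 + 2 + ... + k."""
    return k * (k + 1) // 2


# The closed form must satisfy f(k) <= x < f(k + 1) from Equation (4).
for x in range(1000):
    k = tri_inferior(x)
    assert f(k) <= x < f(k + 1)
```

Using the integer square root avoids floating-point rounding near perfect squares, which is a design choice of this sketch rather than something the text prescribes.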
Tabirca [2000] also proposed an equation for the upper bounds of the load balance scheduling
method based on the Smarandache f-inferior part function. If the work w satisfies certain
conditions [Tabirca, 2000], then the upper bounds are given by

$h_j^{(1)} = f_{[]}(j \cdot W), \quad j = 1, 2, \ldots, p.$ (7)

Moreover, Tabirca [2000a] applied this method to the product between an upper diagonal
matrix and a vector. It was proved that the load balance scheduling method offers the lowest
running time in comparison with other static scheduling methods [Tabirca, 2000b].


2. A NEW EQUATION FOR THE UPPER BOUNDS 
In this section, a new equation for the upper bounds is introduced. Some theoretical 
considerations about the new equation and Equation (7) are also made. Consider that
$f: N \to R$ is defined by $f(k) = \sum_{i=1}^{k} w_i$, $f(0) = 0$. For the work w, we assume the
following [Tabirca, 2000]:

A1: $w_i \le \frac{1}{p} \sum_{i=1}^{n} w_i$ for all $i$.

A2: There are equations for the functions $f$ and $f_{[]}$.


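Theorem 1. The upper bounds of the load balance scheduling method are given by

$h_j^{(2)} = f_{[]}\left( f(h_{j-1}^{(2)}) + W \right), \quad j = 1, 2, \ldots, p.$ (8)

Proof. For simplicity, we write $h_j = h_j^{(2)}$ in the following, with $h_0 = 0$. Equation (3) gives the upper
bounds of the load balance scheduling method. We start from the equation

$\sum_{i=l_j}^{h_j} w_i \le W < \sum_{i=l_j}^{h_j+1} w_i$

and, using $l_j = h_{j-1} + 1$, add $f(h_{j-1}) = \sum_{i=1}^{h_{j-1}} w_i$ to all the sides:

$\sum_{i=1}^{h_j} w_i \le f(h_{j-1}) + W < \sum_{i=1}^{h_j+1} w_i.$

Based on the definition of $f_{[]}$, we find that $h_j = f_{[]}\left( f(h_{j-1}) + W \right)$. ♦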
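The recurrence of Theorem 1 can be sketched in Python with prefix sums and a binary search. This is an illustration only: the name `bounds_eq8` is hypothetical, the work values are assumed to be positive, and $f_{[]}$ is evaluated only on the range 0..n where $f$ is defined by the loop. All quantities are multiplied by $p$ so that integer work values are compared exactly.

```python
from bisect import bisect_right
from itertools import accumulate


def bounds_eq8(w, p):
    """Upper bounds h_1..h_p from h_j = f_[](f(h_{j-1}) + W), with h_0 = 0.

    w[i-1] holds w(i) > 0, and f(k) is the k-th prefix sum of w.
    """
    prefix = [0] + list(accumulate(w))     # prefix[k] = f(k)
    total = prefix[-1]                     # f(n) = p * W
    scaled = [p * v for v in prefix]       # p * f(k), strictly increasing
    h_prev, result = 0, []
    for _ in range(p):
        target = scaled[h_prev] + total    # p * (f(h_{j-1}) + W)
        # Largest k in 0..n with p * f(k) <= target.
        h = bisect_right(scaled, target) - 1
        result.append(h)
        h_prev = h
    return result


print(bounds_eq8(list(range(1, 11)), 2))
```

For w(i) = i with n = 10 and p = 2 the sketch prints [6, 9].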


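The following theorem shows how the two bounds compare. Here $h_j^{(1)}$ denotes the bound given by
Equation (7) and $h_j^{(2)}$ the bound given by Equation (8), with $l_j^{(2)}$ the corresponding lower bound.
Theorem 2. $h_j^{(2)} \le h_j^{(1)}$, $j = 1, 2, \ldots, p$.
Proof. Recall that these two upper bounds satisfy

$\sum_{i=1}^{h_j^{(1)}} w_i \le j \cdot W < \sum_{i=1}^{h_j^{(1)}+1} w_i,$ (9.a)

$\sum_{i=l_j^{(2)}}^{h_j^{(2)}} w_i \le W < \sum_{i=l_j^{(2)}}^{h_j^{(2)}+1} w_i.$ (9.b)

Adding the left-hand inequalities of Equation (9.b) for the first $j$ processors, whose blocks are
consecutive and start at iteration 1, we find

$\sum_{s=1}^{j} \sum_{i=l_s^{(2)}}^{h_s^{(2)}} w_i = \sum_{i=1}^{h_j^{(2)}} w_i \le j \cdot W.$

Because $h_j^{(1)}$ is the last index satisfying Equation (9.a), we find that $h_j^{(2)} \le h_j^{(1)}$ holds. ♦

Consequence: $f(h_j^{(2)}) \le f(h_j^{(1)}) \le j \cdot W$, $j = 1, 2, \ldots, p$.
This consequence follows directly from the monotonicity of f and the definition of the bounds.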
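The ordering stated in Theorem 2 can be observed numerically. The Python sketch below computes both families of bounds from prefix sums; the name `both_bounds` is hypothetical, the labels h1 and h2 stand for the bounds of Equations (7) and (8) respectively, and positive work values are assumed.

```python
from bisect import bisect_right
from itertools import accumulate


def both_bounds(w, p):
    """Return (h1, h2): bounds from Equation (7) and from Equation (8)."""
    prefix = [0] + list(accumulate(w))     # prefix[k] = f(k)
    total = prefix[-1]                     # f(n) = p * W
    scaled = [p * v for v in prefix]       # p * f(k)

    def f_inf(t):
        # Largest k in 0..n with p * f(k) <= t.
        return bisect_right(scaled, t) - 1

    h1 = [f_inf(j * total) for j in range(1, p + 1)]   # f_[](j * W)
    h2, prev = [], 0
    for _ in range(p):
        prev = f_inf(scaled[prev] + total)             # f_[](f(h_{j-1}) + W)
        h2.append(prev)
    return h1, h2


h1, h2 = both_bounds(list(range(1, 11)), 2)
print(h1, h2)
assert all(b <= a for a, b in zip(h1, h2))
```

For w(i) = i with n = 10 and p = 2 this prints [6, 10] [6, 9], so the bound from Equation (8) does not exceed the bound from Equation (7) in this instance.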


We now have two equations for the upper bounds of the load balance scheduling method.
Equation (8) was obtained naturally by starting from the definition of the load balance. It
reflects the case when several load balances are performed consecutively. Equation (7) was
found by considering the last partial sum that is under $j \cdot W$. This option does not consider
any load balance, so we expect it to be less efficient. However, it is difficult to
predict which equation is better for a given computation. The best
practical advice is to apply both of them and to choose the one that gives the lowest running time.


3. COMPUTATIONAL RESULTS 
In this section we present an example for the load balance scheduling method. This example
deals with the product between an upper diagonal matrix and a vector [Jaja, 1992]. All the
computations have been performed on an SGI Power Challenge 2000 parallel machine with 16
processors. The dimension of the matrix was n=300.
    DO PARALLEL i=1,n
        y_i = a_{i,1} · x_1
        DO j=2,i
            y_i = y_i + a_{i,j} · x_j
        END DO
    END DO


Figure 2. Parallel Computation for the Upper Matrix — Vector Product. 
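Recall that $a = (a_{i,j})_{i,j=1,\ldots,n} \in M_n(R)$ is upper diagonal if $a_{i,j} = 0$, $i < j$. The product
$y = a \cdot x$ between an upper diagonal matrix $a = (a_{i,j})_{i,j=1,\ldots,n} \in M_n(R)$ and a vector
$x \in R^n$ is given by

$y_i = \sum_{j=1}^{i} a_{i,j} \cdot x_j \quad \forall i = 1, 2, \ldots, n.$ (10)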
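A sequential Python sketch of the computation in Equation (10) is given below for illustration. It is not the parallel code used in the experiments: the name `matvec_eq10` and the list-of-lists matrix layout are assumptions, and indices are 0-based.

```python
def matvec_eq10(a, x):
    """Compute y_i = sum_{j=1..i} a_{i,j} * x_j using 0-based indices.

    Row i (1-based) needs i products, which is where w(i) = i comes from.
    """
    n = len(x)
    y = [0] * n
    for i in range(n):                 # corresponds to the parallel loop
        acc = a[i][0] * x[0]
        for j in range(1, i + 1):      # remaining entries of row i up to the diagonal
            acc += a[i][j] * x[j]
        y[i] = acc
    return y


a = [[1, 0, 0],
     [2, 3, 0],
     [4, 5, 6]]
print(matvec_eq10(a, [1, 1, 1]))
```

For the 3 by 3 example the sketch prints [1, 5, 15]. Entries with i < j are never read, so only the nonzero part of the matrix contributes to the work.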
The parallel computation of Equation (10) is shown in Figure 2.


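The work of iteration i is given by $w(i) = i$, $i = 1, 2, \ldots, n$. The total work is
$f(n) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ and $W = \frac{n(n+1)}{2p}$. The Smarandache f-inferior part function is
$f_{[]}(x) = \left[ \frac{-1 + \sqrt{1 + 8x}}{2} \right] \ \forall x \ge 0$. Therefore, the upper bounds of the load balance
scheduling method given by Equations (7) and (8) are, respectively,

$h_j^{(1)} = \left[ \frac{-1 + \sqrt{1 + \frac{4 n (n+1) j}{p}}}{2} \right], \quad j = 1, 2, \ldots, p,$ (11)

$h_j^{(2)} = \left[ \frac{-1 + \sqrt{1 + 4 h_{j-1}^{(2)} (h_{j-1}^{(2)} + 1) + \frac{4 n (n+1)}{p}}}{2} \right], \quad j = 1, 2, \ldots, p.$ (12)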
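The two families of bounds for w(i) = i can be evaluated directly from the closed form of the f-inferior part function for triangular sums, $[(-1 + \sqrt{1 + 8x})/2]$, applied to $x = j \cdot W$ and to $x = f(h_{j-1}) + W$ with $W = n(n+1)/(2p)$. The Python sketch below is illustrative; the function names are hypothetical, and integer square roots are used so that the inferior part is computed exactly.

```python
import math


def bounds_jw(n, p):
    """h_j = [(-1 + sqrt(1 + 4 n (n+1) j / p)) / 2], from x = j * W."""
    return [(math.isqrt(1 + (4 * n * (n + 1) * j) // p) - 1) // 2
            for j in range(1, p + 1)]


def bounds_recursive(n, p):
    """h_j from x = f(h_{j-1}) + W with f(h) = h (h+1) / 2 and h_0 = 0."""
    h, result = 0, []
    for _ in range(p):
        radicand = 1 + 4 * h * (h + 1) + (4 * n * (n + 1)) // p
        h = (math.isqrt(radicand) - 1) // 2
        result.append(h)
    return result


print(bounds_jw(300, 2), bounds_recursive(300, 2))
```

For n = 300 and p = 2 the sketch prints [211, 300] [211, 299]. Taking the integer part of the radicand before the square root does not change the result, because the inferior part of a square root depends only on the integer part of its argument.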
The running times for these two types of upper bounds are presented in Table 1. Figure 3
shows that the running times for these two types of bounds are comparable.


          P=1     P=2     P=3     P=6     P=8
h he      1.847   1.347   0.987   0.750   0.482
Table 1. Times of the computation.


4. FINAL CONCLUSION

An important remark is that the Smarandache f-inferior part function was
applied successfully to solve an important scheduling problem. Based on it, two equations for
the upper bounds of the load balance scheduling method have been found. These equations
have been used to compute the product between an upper diagonal matrix and a vector, and the
computational times were quite similar. The upper bounds given by the new equation have
provided a better computation for this problem.
Figure 3. Graphics of the Running Times.